Estimating speaker characteristics for speech recognition

Authors

  • Mats Blomberg
  • Daniel Elenius
Abstract

A speaker-characteristic-based hierarchical tree of speech recognition models is designed. The leaves of the tree contain model sets, which are created by transforming a conventionally trained set using leaf-specific speaker profile vectors. The non-leaf models are formed by merging the models of their child nodes. During recognition, a maximum likelihood criterion is followed to traverse the tree from the root to a leaf. The computational load for estimating one-dimensional (vocal tract length) and four-dimensional (vocal tract length, two spectral slope parameters and model variance scaling) speaker profile vectors is reduced to a fraction of that of an exhaustive search among all leaf nodes. Recognition experiments on children's connected digits using adult models exhibit similar recognition performance for the exhaustive and the one-dimensional tree search. Further error reduction is achieved with the four-dimensional tree. The estimated speaker properties are analyzed and discussed.
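The root-to-leaf traversal described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Node` class, the binary branching, the use of a single number as the speaker profile, the averaging of child profiles as a stand-in for merging child models, and the toy `score` function that stands in for the likelihood of an utterance given a node's model set. It only shows why a greedy descent evaluates fewer nodes than an exhaustive search over all leaves.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    # Toy stand-in for a speaker profile, e.g. a vocal tract length warp factor.
    profile: float
    children: List["Node"] = field(default_factory=list)


def build_tree(leaf_profiles: List[float]) -> Node:
    """Build a binary tree bottom-up. A parent's profile is the mean of its
    children's profiles (a hypothetical stand-in for merging child models)."""
    level = [Node(p) for p in leaf_profiles]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            kids = level[i:i + 2]
            mean = sum(k.profile for k in kids) / len(kids)
            parents.append(Node(mean, kids))
        level = parents
    return level[0]


def tree_search(root: Node, score: Callable[[Node], float]) -> Tuple[Node, int]:
    """Greedy descent: at each level score only the children of the current
    node and follow the best one. Returns the leaf and the evaluation count."""
    node, evals = root, 0
    while node.children:
        scored = [(score(child), child) for child in node.children]
        evals += len(scored)
        node = max(scored, key=lambda pair: pair[0])[1]
    return node, evals


def leaves(node: Node) -> List[Node]:
    """Collect all leaf nodes below (and including) the given node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def exhaustive_search(root: Node, score: Callable[[Node], float]) -> Tuple[Node, int]:
    """Score every leaf and return the best one with the evaluation count."""
    all_leaves = leaves(root)
    return max(all_leaves, key=score), len(all_leaves)


# Eight hypothetical warp factors as leaf profiles.
root = build_tree([0.80, 0.85, 0.90, 0.95, 1.00, 1.05, 1.10, 1.15])


def score(node: Node) -> float:
    # Toy "log-likelihood": highest when the node profile matches the speaker.
    return -(node.profile - 0.88) ** 2


tree_leaf, tree_evals = tree_search(root, score)
best_leaf, leaf_evals = exhaustive_search(root, score)
print(tree_leaf.profile, tree_evals, best_leaf.profile, leaf_evals)
```

In this toy setting the greedy descent scores two nodes per level, so its cost grows with the depth of the tree rather than with the number of leaves. A greedy descent is not guaranteed to reach the same leaf as the exhaustive search in general; the abstract reports similar recognition performance for the two in the one-dimensional case.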


Similar resources

Zero-crossing analysis technique in speech and speaker recognition

The results of previous studies showed a possibility of application of zero-crossing rates as parameters for speech and speaker recognition. The paper presents a method for obtaining these parameters and the results of experiments carried out in order to estimate the absolute and relative discriminating power of zero-crossing parameters in speech and speaker recognition. As a basic test mate...


Distance between the pseudosections of the vocal tract as the criterion in the speaker recognition process

The procedure utilized in any approach to speaker identification could substantially influence the resulting level of the ultimate identification accuracy of the used technique. In this regard, two distinctly separate operational phases may be identified for any process of this type. First the identification parameters and associated measurement technique must be chosen. Secondly, statistica...


Speaker-independent connected letter recognition with a multi-state time delay neural network

We present a Multi-State Time Delay Neural Network (MS-TDNN) for speaker-independent, connected letter recognition. Our MS-TDNN achieves 98.5/92.0% word accuracy on speaker dependent/independent English letter tasks [7, 8]. In this paper we will summarize several techniques to improve (a) continuous recognition performance, such as sentence level training, and (b) phonetic ...


Perception of Nasal Consonants with Special Reference to Catalan

In this paper I study the role that different place cues play in the recognition of nasal stops. I claim that their perceptual relevance is strongly dependent on how they are related at the articulatory and acoustic levels and, essentially, on the nature of the process of speech perception itself. I show that this is the case by investigating experimentally interactive perceptual effects bet...



Publication date: 2009